Efficient Inference of Haplotypes from Genotypes on a Pedigree with Mutations and Missing Alleles (Extended Abstract)

Authors

  • Wei-Bung Wang
  • Tao Jiang
Abstract

Driven by the international HapMap project, the haplotype inference problem has become an important topic in the computational biology community. In this paper, we study how to efficiently infer haplotypes from genotypes of related individuals as given by a pedigree. We assume that the input pedigree data may contain de novo mutations and missing alleles but is free of genotyping errors and recombinants, which is usually true for tightly linked markers. We formulate the problem as a combinatorial optimization problem, called the minimum mutation haplotype configuration (MMHC) problem, in which we seek haplotypes consistent with the given genotypes that incur no recombinants and require the minimum number of mutations. This extends the well-studied zero-recombinant haplotype configuration (ZRHC) problem. Although ZRHC is polynomial-time solvable, MMHC is NP-hard. We construct an integer linear program (ILP) for MMHC using the system of linear equations over the field F(2) that has been developed recently to solve ZRHC. Since the number of constraints in the ILP is large (exponentially large in the general case), we present an incremental approach for solving the ILP in which we gradually add the constraints to a standard ILP solver until a feasible haplotype configuration is found. Our preliminary experiments on simulated data demonstrate that the method is very efficient on large pedigrees and can infer haplotypes very accurately as well as recover most of the mutations and missing alleles correctly.
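As a rough illustration of the kind of computation behind the F(2) formulation mentioned in the abstract, the sketch below solves a generic linear system over F(2) by Gaussian elimination. It is not the authors' ZRHC or MMHC system: the abstract does not specify the variables or equations, so the function name `solve_gf2` and the small example system are hypothetical and chosen only for illustration.

```python
def solve_gf2(rows, rhs, n_vars):
    """Solve A x = b over F(2) by Gaussian elimination (illustrative sketch).

    rows: list of 0/1 coefficient lists, one per equation (hypothetical input format).
    rhs:  list of 0/1 right-hand sides.
    Returns one solution as a 0/1 list (free variables set to 0),
    or None if the system is inconsistent.
    """
    # Pack each equation into an int: bit j is the coefficient of x_j,
    # bit n_vars is the right-hand side.
    eqs = []
    for coeffs, b in zip(rows, rhs):
        mask = 0
        for j, c in enumerate(coeffs):
            if c & 1:
                mask |= 1 << j
        if b & 1:
            mask |= 1 << n_vars
        eqs.append(mask)

    pivots = []  # (column, row index) pairs
    r = 0
    for col in range(n_vars):
        # Find a row at or below r with a 1 in this column.
        pivot = next((i for i in range(r, len(eqs)) if (eqs[i] >> col) & 1), None)
        if pivot is None:
            continue  # free variable
        eqs[r], eqs[pivot] = eqs[pivot], eqs[r]
        # Addition over F(2) is XOR: clear this column in every other row.
        for i in range(len(eqs)):
            if i != r and (eqs[i] >> col) & 1:
                eqs[i] ^= eqs[r]
        pivots.append((col, r))
        r += 1

    # A row reading "0 = 1" means no solution exists.
    if any(e == 1 << n_vars for e in eqs):
        return None

    x = [0] * n_vars
    for col, row in pivots:
        x[col] = (eqs[row] >> n_vars) & 1
    return x


# Toy system: x0 + x1 = 1, x1 + x2 = 0, x0 + x2 = 1 (all mod 2).
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
solution = solve_gf2(A, [1, 0, 1], 3)
```

Changing the last right-hand side to 0 makes the toy system inconsistent, in which case `solve_gf2` returns `None`. In the setting the abstract describes, an inconsistency of this kind is what would have to be explained by mutations, which is where the ILP comes in; that step is not sketched here.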


Similar resources

Inferring Haplotypes from Genotypes on a Pedigree with Mutations, Genotyping Errors and Missing Alleles

Inferring the haplotypes of the members of a pedigree from their genotypes has been extensively studied. However, most studies do not consider genotyping errors and de novo mutations. In this paper, we study how to infer haplotypes from genotype data that may contain genotyping errors, de novo mutations, and missing alleles. We assume that there are no recombinants in the genotype data, which i...


Efficient inference of haplotypes from genotypes on a large animal pedigree.

We present a simple algorithm for reconstruction of haplotypes from a sample of multilocus genotypes. The algorithm is aimed specifically for analysis of very large pedigrees for small chromosomal segments, where recombination frequency within the chromosomal segment can be assumed to be zero. The algorithm was tested both on simulated pedigrees of 155 individuals in a family structure of three...


HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination

MOTIVATION: Haplotype reconstruction is an essential step in genetic linkage and association studies. Although many methods have been developed to estimate haplotype frequencies and reconstruct haplotypes for a sample of unrelated individuals, haplotype reconstruction in large pedigrees with a large number of genetic markers remains a challenging problem. METHODS: We have developed an efficient...


Computing the Minimum Recombinant Haplotype Configuration from Incomplete Genotype Data on a Pedigree by Integer Linear Programming

We study the problem of reconstructing haplotype configurations from genotypes on pedigree data with missing alleles under the Mendelian law of inheritance and the minimum-recombination principle, which is important for the construction of haplotype maps and genetic linkage/association analyses. Our previous results show that the problem of finding a minimum-recombinant haplotype configuration ...


A new method for haplotype inference including full-sib information.

Recent literature has suggested that haplotype inference through close relatives, especially from nuclear families, can be an alternative strategy in determining linkage phase and estimating haplotype frequencies. In the case of no possibility to obtain genotypes for parents, and only full-sib information being used, a new approach is suggested to infer phase and to reconstruct haplotypes. We p...




Publication date: 2009